We used multimedia surveys to investigate secondary mathematics teachers’ reactions to storyboards 
that represented episodes of instruction. Participants were asked open-ended questions about the 
storyboards. We analyzed the responses to the open-ended questions for evidence of the attitudes 
(Martin & White, 2005) that participants conveyed about the episodes. We found that, when 
presented with storyboards that depart from what is hypothesized to be routine instruction, 
participants’ open responses included significantly more negative than positive linguistic markers of 
attitude. At the same time, when participants were shown storyboards that represented what 
routinely happens in classrooms, positive and negative linguistic markers of attitude 
occurred with nearly equal frequency. 


Keywords: Instructional Activities and Practices; Geometry 


Introduction 

For as long as mathematics has been taught in US public schools, there have been initiatives that 
have attempted to improve the quality of mathematics teaching in classrooms. A fundamental 
challenge for such initiatives is the paradox of change without difference: reform efforts that, in 
principle, could bring about fundamental shifts in classrooms emerge, in practice, as “shadows of 
their original intent” (Woodbury & Gess-Newsome, 2002, p. 763). One reason for this paradox is that 
the patterns of classroom interaction that practicing teachers have honed through years of experience 
are robust. Initiatives that aim to effect change in the way that mathematics is taught thus need to 
contend with the realities of the already established practice of mathematics teachers (Cobb, Zhao, & 
Dean, 2009). To understand how reform efforts might contend with such realities we need to raise a 
natural question: When teachers encounter reasonable departures from routine instruction, how do 
they relate to such actions? 

To answer this question, we conducted a study that used multimedia surveys to investigate 
secondary mathematics teachers’ reactions to storyboards that represented episodes of instruction. 
Participants were asked open-ended questions about the storyboards. We analyzed the responses to 
the open-ended questions for evidence of the attitudes (Martin & White, 2005) that participants 
conveyed about the episodes. We found that, when presented with storyboard representations of 
reasonable departures from what we hypothesized to be routine instruction, participants provided 
open responses that contained more negative than positive linguistic markers of attitude. At the same 
time, when participants were shown storyboards that represented what routinely happens in 
classrooms, positive and negative markers of attitude occurred with equal frequency. 

The analysis reported below is part of a larger study whose objective was to investigate 
instructional routines that pertain to discipline-specific communication practices. The communication 
skills used by disciplinary experts have traditionally been thought to be gradually and tacitly 
developed by novices as the novices are apprenticed into a field (Lemke, 2013; Thurston, 1994). But 
recent work in analyzing mathematical communication suggests that discipline-specific ways of 
communicating are practices that can be described and taught (Fang, 2012; O’Halloran, 2011; Yore, 
Pimm & Tuan, 2007). Doing proofs is one classroom activity during which students could develop 
discipline-specific communication practices. As the geometry classroom has historically been the 
principal instructional setting in which students are introduced to mathematical proof (Knuth, 2002), 
the instructional situation of doing proofs in geometry (Herbst & Brach, 2006) was the focus of this 
study. 


Theoretical Framework 

Classroom activity can be modeled as a social system in which an agent playing the role of 
teacher and other agents playing the roles of students act in accordance with tacit but mutually held 
norms (Herbst & Chazan, 2012). Such norms help characterize instructional situations: stable 
segments of classroom activity in which students’ work is exchanged for claims that they have 
acquired items of knowledge (Herbst, 2006; Herbst & Brach, 2006). Though the instructional activity 
of doing proofs in geometry has been criticized by mathematics educators for being a 
misrepresentation of the work of proving in mathematics (Schoenfeld, 1988; Martin & Harel, 1989; 
Lockhart, 2009), it endures as an instructional setting where students are introduced to the notion that 
there is such a thing as mathematical proof. The goal of this work was to describe norms of the 
instructional situation of doing proofs that pertain to how student proofs are presented and checked in 
geometry classrooms. 

We use norm to refer to those aspects of social situations that not only happen regularly but 
that participants (in social situations) also expect to happen (Garfinkel, 1963). In social situations, when 
people confront departures from what they expect, they can react with anxiety, bewilderment, or 
anger (Mehan & Wood, 1975). Such negative reactions are ways in which people mark that a norm 
has been breached. The work reported here used the notion of a breaching experiment (Garfinkel, 
1963) to investigate secondary teachers’ reactions to episodes of instruction in which hypothesized 
norms¹ of instructional situations were breached. 


Method 

Our method of inquiry combined a planned-comparison study with a virtual breaching 
experiment (Herbst, Aaron, Dimmel, & Erickson, 2013) in an instrument that we call a virtual 
breaching experiment with control (Dimmel & Herbst, 2014). The instrument was a multimedia 
survey that used storyboards to represent episodes of geometry instruction that were inspired by 
video records of actual geometry classrooms. Our use of storyboards to probe for recognition of 
norms is analogous to how scripted classroom videos have been used to probe teacher professional 
knowledge (Kaiser, 2014) and is an application of the cyclical use of records of practice (Jacobs, 
Kawanaka, & Stigler, 1999). Each participant² in our study viewed two sets of parallel storyboards: 
one set of parallel storyboards represented departures from hypothesized norms (i.e., breach 
storyboards), and the other set of parallel storyboards represented instances of instruction that were 
hypothesized to be routine³ (i.e., control storyboards). The storyboards that were designed to 
represent routine instruction (i.e., the control storyboards) were based on video records of actual 
geometry classrooms, hence our claim that these storyboards represent the instruction that might 
typically occur in geometry classrooms. The storyboards in a set were parallel in the sense that they 
targeted the same hypothesized norm. 

After viewing each storyboard, participants were given four opportunities to provide open-response 
data. The first question that participants were asked was: “What did you see happening in this 
scenario?” The purpose of prompting participants with this broad, open-ended question was to 
capture participants’ overall reactions to the instances of doing proofs (hereafter: situation instances) 
that were represented by the different storyboards. This general open response question has been 
used in previous virtual breaching experiments (Herbst, Aaron, Dimmel, & Erickson, 2013) as a 
means to capture participants’ reactions to storyboards. Participants had three other opportunities to 
provide open responses following their review of each storyboard. These open response fields 
followed episode-level (how appropriate was the teacher’s review of the proof in this scenario?) and 
segment-specific (how appropriate were the teacher’s actions in this segment of the storyboard?) 
appropriateness rating questions.⁴ Following each rating question, participants were prompted to 
“Please explain your rating.” 

Responses to the four open response questions were coded using a scheme derived from the 
attitude system of the appraisal framework (Martin & White, 2005). Coding for attitude is a means to 
capture participants’ ways of feeling (Martin & White, 2005) toward the situation instances 
represented by the storyboards. The attitude system differentiates statements of affect, judgment, and 
appreciation. Statements of affect convey personal feelings through linguistic markers of emotion, 
such as “sad”, “happy” or “angry” (Read & Carroll, 2012). Statements of judgment convey 
evaluations of people and their deeds, such as “he is a good teacher.” Statements of appreciation 
convey aesthetic evaluations of non-person things in the world (goods and services), such as “that is 
a clear proof” (Read & Carroll, 2012). The scheme we developed coded each response for all 
instances of attitude. Attitudes were classified by type (judgment, affect, or appreciation), 
target (e.g., the teacher, the proof), and polarity (positive or negative). 

An example of a response that conveys a positive judgment of the teacher is: “teacher is guiding 
students effectively.” In this response, the teacher is the target of the attitude and “effectively” is a 
positive evaluation of the action—guiding—that the teacher is described as doing in the scenario. 
Since the response is about the quality of how a person (i.e., the teacher) performs an action, it is 
coded as a positive judgment. An example response that contains a negative judgment of the teacher 
is: “This teacher is being a bit ridiculous.” These examples were coded as judgments because, in each 
case, the targets of the appraisals are people and their deeds. An example response that contains an 
appreciation is: “the math proof was not accurate.” In this example, a mathematical proof is the target 
of the appraisal. It was coded as a negative appreciation because the response states that the proof is 
“not accurate.” 


Reliability of the Attitude Coding Scheme 

The attitude scheme was tested for reliability by comparing coded responses of two independent 
coders.⁵ The coders applied each scheme to 100 randomly selected texts in the corpus (25 of each of 
the 4 response types, roughly 10% of the total number of responses). Before each text was coded, it 
was blinded with respect to whether the response was provided for a storyboard in which the norm 
was breached or a storyboard in which the norm was not breached. The purpose of blinding the data 
was to minimize bias. The kappa statistics for the attitude coding for which there were sufficient 
instances of the codes to warrant the statistics are .79 for negative judgments of the teacher; .49 for 
positive judgments of the teacher; .77 for negative mathematical appreciations; .49 for positive 
mathematical appreciations. These kappa scores indicate moderate (.49), high (.76, .77, .79), and 
very high (.89) agreement between the coders. 
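
For readers unfamiliar with the statistic, the following is a minimal sketch of how a kappa value can be computed for a single binary code (e.g., whether a response contains a negative judgment of the teacher). It assumes the statistic is Cohen's kappa for two coders, which the text does not state explicitly. The function name `cohen_kappa` and the two label lists are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels: 1 = code present in the response, 0 = code absent.
coder_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
coder_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(coder_a, coder_b)
```

In this made-up example the coders agree on 8 of 10 items, but because agreement expected by chance is 0.52, kappa is considerably lower than the raw agreement rate.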


Data 
Data were gathered from 73 secondary mathematics teachers located within a 60-mile radius of 
Midwestern University. Participants completed the instrument during in-person and online data 
collection periods that occurred during the 2013-2014 academic year. The multimedia survey 
(described above) that contained the four storyboards was one of several instruments participants 
completed during a day-long data collection event. 


Results 
Each response in the corpus was coded for judgments, appreciations, or statements of affect, and 
each instance of an attitudinal appraisal was coded in each response. This means that it was possible 
for a response to contain multiple instances of the same kind of statement of attitude (e.g., there 
could have been several judgments of a teacher), as well as instances of different kinds of statements 
of attitude (e.g., a judgment of the students, an appreciation of the proof), and statements of attitude 
that had different polarities (e.g., a response could contain both positive and negative judgments of 
the teacher). 

We hypothesized that, in the case of storyboards that represent a breach of a norm, participants 
would react negatively. These negative reactions would be evidenced by higher numbers of negative 
statements of attitude. By contrast, based on the premise that the control storyboards represent 
routine teaching, we hypothesized that open responses associated with control storyboards would 
contain roughly equal numbers of positive and negative statements of attitude. Such a distribution of 
positive and negative statements of attitude could be explained on the basis of individual differences 
(among participants) alone. 

Table 1 shows the total number of positive and negative statements of attitude throughout the 
open responses in the corpus. The results are reported according to storyboard condition. The entire 
corpus contained 1168 open responses to the 4 different open response questions, with equal numbers 
of responses to the breach and control conditions (584 responses per condition). These were divided 
equally (146 per question) among the 4 open response questions that were described above. The 
results reported in Table 1 are for the entire corpus across all 4 question types. They are simple 
counts of the number of positive or negative statements of attitude that were coded in the open 
responses to the storyboards in the breach and control conditions. 


Table 1: Counts of statements of attitude, tallied by polarity and storyboard condition. 


Storyboard Condition    Statements of        Statements of        Total
                        Positive Attitude    Negative Attitude
Breach Storyboards      211                  473                  684
Control Storyboards     309                  310                  619


The results reported in Table 1 are consistent with the hypotheses stated above. The open 
responses to the storyboards in which a hypothesized norm was breached had more statements of 
negative attitude than statements of positive attitude. In the case of the control storyboards, there 
were nearly equal numbers of positive and negative statements of appraisal. A chi-square test 
indicates that there is a significant association between storyboard condition and the number of 
positive or negative statements of attitude (χ² = 48.49, p < .001). 

The unit of analysis for the results reported in Table 1 is a statement of attitude. This means that 
each statement of attitude in a response was included in the totals. To further investigate the 
relationship between attitude polarity and storyboard condition, we recoded the data to eliminate 
multiples, by polarity, within each response. Thus, if a response contained 3 positive statements of 
attitude and 2 negative statements of attitude, it was recoded as (1) for positive attitude and (1) for 
negative attitude. Table 2 reports tallies of statements of attitude after applying this reduction. The 
unit of analysis for the results reported in Table 2 is an open response (n = 584 for each storyboard 
condition). 
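
The reduction described above can be sketched in a few lines. This is a minimal illustration that assumes each coded statement is stored with its type, target, and polarity. The `Attitude` class, the `reduce_response` function, and the sample response are hypothetical and do not represent the authors' actual coding tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attitude:
    kind: str      # "judgment", "affect", or "appreciation"
    target: str    # e.g., "teacher" or "proof"
    polarity: str  # "positive" or "negative"


def reduce_response(attitudes):
    """Collapse multiples by polarity: return (has_positive, has_negative) as 0/1."""
    has_positive = int(any(a.polarity == "positive" for a in attitudes))
    has_negative = int(any(a.polarity == "negative" for a in attitudes))
    return has_positive, has_negative


# Hypothetical response with 3 positive and 2 negative statements of attitude.
response = [
    Attitude("judgment", "teacher", "positive"),
    Attitude("judgment", "teacher", "positive"),
    Attitude("appreciation", "proof", "positive"),
    Attitude("judgment", "teacher", "negative"),
    Attitude("appreciation", "proof", "negative"),
]
flags = reduce_response(response)
```

Under this reduction a response contributes at most one count to each polarity, so a single response can still be tallied once as positive and once as negative.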

Table 2: Counts of open responses that contain positive or negative statements of attitude, by 
storyboard condition. 


Storyboard Condition    Statements of        Statements of        Total
                        Positive Attitude    Negative Attitude
Breach Storyboards      176                  352                  528
Control Storyboards     248                  245                  493


The results reported in Table 2 are consistent with those reported above. Across the corpus for 
the control storyboards, there were 248 responses that contained at least one statement of positive 
attitude and 245 responses that contained at least one statement of negative attitude. By contrast, 
across the corpus for the breach storyboards, there were 176 responses that contained at least one 
statement of positive attitude compared to 352 responses that contained at least one statement of 
negative attitude. A chi-square test of association indicates that there is a significant relationship 
between storyboard condition and attitude polarity (χ² = 29.54, p < .001). 

The results reported in Table 2 are a refinement of the results reported in Table 1 because 
multiples have been eliminated. The results reported in Table 3 (below) refine these results further by 
distinguishing 4 categories of response: those that contain only positive statements of attitude, those 
that contain only negative statements of attitude, those that contain both positive and negative 
statements of attitude, and those that contain no statements of attitude. 


Table 3: Counts of open responses that contain only positive attitude, only negative attitude, 
both, or neither. 


Storyboard Condition    Only Positive    Only Negative    Both Positive    None    Total
                        Attitude         Attitude         and Negative
                                                          Attitude
Breach Storyboards      78               254              98               154     584
Control Storyboards     166              163              82               173     584


The results reported in Table 3 are consistent with those reported in Table 1 and Table 2. The 
breach storyboards elicited more responses that contained only negative statements of attitude than 
responses that contained only positive statements of attitude. By contrast, the control storyboards 
elicited nearly equal numbers of responses that contained only positive statements of attitude and 
responses that contained only negative statements of attitude. A chi-square test of association 
indicates a significant relationship between storyboard condition and the categories of attitude in 
Table 3 (χ² = 54.12, p < .001). 
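
The three chi-square statistics reported in this section can be checked directly from the table counts. The sketch below is a plain-Python Pearson chi-square computation. The function name `chi_square` is a hypothetical helper. One assumption is flagged in the code: the values reported for the two 2×2 tables (Tables 1 and 2) are reproduced when Yates's continuity correction is applied, so the sketch applies it there. The text does not say which correction, if any, was used, so this is an inference from the numbers.

```python
def chi_square(table, yates=False):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # ASSUMPTION: continuity correction, applied to the 2x2 tables only.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / expected
    return statistic


# Counts copied from Tables 1-3 (rows: breach, control).
table_1 = [[211, 473], [309, 310]]                    # statements of attitude
table_2 = [[176, 352], [248, 245]]                    # responses with at least one statement
table_3 = [[78, 254, 98, 154], [166, 163, 82, 173]]   # only +, only -, both, none

chi2_1 = chi_square(table_1, yates=True)
chi2_2 = chi_square(table_2, yates=True)
chi2_3 = chi_square(table_3)
```

Rounded to two decimals, the three results agree with the reported values of 48.49, 29.54, and 54.12. The degrees of freedom are 1 for the 2×2 tables and 3 for the 2×4 table.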

The results reported above provide evidence to support our hypotheses: Throughout the corpus, 
the responses to the breach versions of the storyboards yielded more negative statements of attitude 
than positive statements of attitude. That responses to the breach versions of the storyboards 
produced more negative statements of attitude is consistent with the results of the breaching 
experiments conducted by Garfinkel (1963) and the virtual breaching experiments conducted by 
others (Herbst, Aaron, Dimmel, & Erickson, 2013). In contrast with the breach storyboards, 
responses to the control storyboards contained roughly equal numbers of positive and negative 
statements of attitude. Because the breach and control storyboards were the same except during those 
frames where the teacher is shown breaching (or complying with) a hypothesized norm, it follows 
that the teacher’s breach of the norm is what prompted participants to react negatively to the 
storyboard. 


Conclusion 

We began with the question: When teachers encounter reasonable departures from the routine, 
how do they react? The results reported above provide evidence that secondary mathematics teachers 
react more negatively to episodes of instruction that represent breaches of hypothesized classroom 
norms than to episodes of instruction that represent instruction that is hypothesized to be routine. We 
provide context for these findings here by describing the nature of the breaches. 

The storyboards representing instances of doing proofs were scripted to investigate 
communication practices expected by secondary teachers when proofs are presented (by students) 
and checked in geometry classrooms. An example of a routine practice for presenting proofs is that 
of a student going to the board and creating a mark-for-mark reproduction of an already completed 
proof—an act we call proof transcription. Such transcriptions of proofs by students do not match the 
practices used by disciplinary experts, where a proof is described verbally (with accompanying 
gestures) as it is generated at a blackboard (Artemeva & Fox, 2011; Greiffenhagen, 2014; Núñez, 
2009). 

In our study, one set of breach storyboards depicted teachers interfering with student 
transcriptions of proofs, for example, by requiring students to provide labels on a diagram before 
using those labels in a proof. The teacher’s interference could be defended as reasonable on the 
grounds that the teacher is steering the student presenter toward staging a discovery—as opposed to a 
reproduction—of the proof that shows how the student engaged with the material artifact of the proof 
(Livingston, 1999). Such a move on the part of the teacher could be seen as an effort to bring the 
student’s proof presentation practices more in line with disciplinary practices for presenting proofs 
(Greiffenhagen, 2014). In fact, some participants in the study remarked on the positive instructional 
value of the teacher’s interference in such storyboards. Yet on the whole, the attitudes that 
participants expressed toward the teachers that interfered with the student transcriptions tended to be 
negative. What are we to make of these findings? 

One implication is that teachers could recognize, in the abstract, the value of an 
instructional alternative yet prefer, in actuality, the routines they have developed. The tendency 
toward routine is not evidence of a deficiency in teachers but is rather a fact of social life (Garfinkel, 
1963). We see teachers’ preference for routines as a resource that could be used to design 
instructional alternatives that are likely to have greater uptake by practicing teachers. For the work of 
managing student presentations of proofs in geometry classrooms, the expectation that students 
create mark-for-mark reproductions of proofs could be the basis for alternatives that would help 
students develop discipline-specific communication practices. An example of such an alternative 
could involve asking students to present proofs in pairs, where one student is responsible for 
generating the transcription and the other student explains the proof as the transcription is being 
completed. Such an alternative practice would recognize the value in the existing routine (i.e., that 
the proof displayed on the board is an accurate record of the work the student completed) 
while at the same time providing a scaffold for students to develop the proof presentation skills that are 
used by mathematical experts. 
Endnotes 

¹The target norms were: (1) hypotheses about the details students are expected to include in a 
proof, and (2) hypotheses about how students are expected to present proofs during class. 

²Participants were randomly assigned to one of five different treatment groups. 

³The design of the study was described in a prior report. See Dimmel and Herbst (2014) for 
details. 

⁴Analysis of the closed-ended responses to the rating questions was reported in a prior study 
(Dimmel & Herbst, 2014). 

⁵We acknowledge the support of Nicolas Boileau for assisting with the reliability study. 